AITopics | quadratic convergence

Collaborating Authors

quadratic convergence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Distributed Multitask Reinforcement Learning with Quadratic Convergence

Neural Information Processing SystemsNov-20-2025, 22:31:52 GMT

Multitask reinforcement learning (MTRL) suffers from scalability issues when the number of tasks or trajectories grows large. The main reason behind this drawback is the reliance on centeralised solutions. Recent methods exploited the connection between MTRL and general consensus to propose scalable solutions. These methods, however, suffer from two drawbacks. First, they rely on predefined objectives, and, second, exhibit linear convergence guarantees. In this paper, we improve over state-of-the-art by deriving multitask reinforcement learning from a variational inference perspective. We then propose a novel distributed solver for MTRL with quadratic convergence guarantees.

multitask reinforcement learning, name change, quadratic convergence, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.58)

Add feedback

Export Reviews, Discussions, Author Feedback and Meta-Reviews

Neural Information Processing SystemsOct-3-2025, 04:36:22 GMT

First provide a summary of the paper, and then address the following criteria: Quality, clarity, originality and significance. This paper investigate fast convergence properties of proximal gradient method and proximal Newton method under the assumption of Constant Nullspace Strong Convexity (CNSC). The problem of interest is to minimize the sum of two convex functions f(x)+h(x), where f is twice differentiable (smooth) and h can be non-smooth but admits a simple proximal mapping. Under the CNSC assumption on f and assuming h has the form of decomposable norm, this paper showed global geometric convergence of the proximal gradient method, and local quadratic convergence of the proximal Newton method. Writing of this paper is very clear.

convergence, decomposable norm, newton method, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre:

Research Report (0.93)
Overview (0.90)

Technology: Information Technology > Artificial Intelligence (0.76)

Add feedback

On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning

Xingguo Li, Lin Yang, Jason Ge, Jarvis Haupt, Tong Zhang, Tuo Zhao

Neural Information Processing SystemsOct-1-2025, 03:07:35 GMT

Numerical experiments are provided to support our theory.

algorithm, artificial intelligence, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Predictability Enables Parallelization of Nonlinear State Space Models

Gonzalez, Xavier, Kozachkov, Leo, Zoltowski, David M., Clarkson, Kenneth L., Linderman, Scott W.

arXiv.org Machine LearningAug-26-2025

The rise of parallel computing hardware has made it increasingly important to understand which nonlinear state space models can be efficiently parallelized. Recent advances like DEER (arXiv:2309.12252) or DeepPCR (arXiv:2309.16318) have shown that evaluating a state space model can be recast as solving a parallelizable optimization problem, and sometimes this approach can yield dramatic speed-ups in evaluation time. However, the factors that govern the difficulty of these optimization problems remain unclear, limiting the larger adoption of the technique. In this work, we establish a precise relationship between the dynamics of a nonlinear system and the conditioning of its corresponding optimization formulation. We show that the predictability of a system, defined as the degree to which small perturbations in state influence future behavior, impacts the number of optimization steps required for evaluation. In predictable systems, the state trajectory can be computed in $O((\log T)^2)$ time, where $T$ is the sequence length, a major improvement over the conventional sequential approach. In contrast, chaotic or unpredictable systems exhibit poor conditioning, with the consequence that parallel evaluation converges too slowly to be useful. Importantly, our theoretical analysis demonstrates that for predictable systems, the optimization problem is always well-conditioned, whereas for unpredictable systems, the conditioning degrades exponentially as a function of the sequence length. We validate our claims through extensive experiments, providing practical guidance on when nonlinear dynamical systems can be efficiently parallelized, and highlighting predictability as a key design principle for parallelizable models.

artificial intelligence, convergence, optimization problem, (16 more...)

arXiv.org Machine Learning

2508.16817

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Add feedback

Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings

Ian En-Hsu Yen, Cho-Jui Hsieh, Pradeep K. Ravikumar, Inderjit S. Dhillon

Neural Information Processing SystemsFeb-10-2025, 00:49:41 GMT

State of the art statistical estimators for high-dimensional problems take the form of regularized, and hence non-smooth, convex programs. A key facet of these statistical estimation problems is that these are typically not strongly convex under a high-dimensional sampling regime when the Hessian matrix becomes rankdeficient. Under vanilla convexity however, proximal optimization methods attain only a sublinear rate. In this paper, we investigate a novel variant of strong convexity, which we call Constant Nullspace Strong Convexity (CNSC), where we require that the objective function be strongly convex only over a constant subspace. As we show, the CNSC condition is naturally satisfied by high-dimensional statistical estimators. We then analyze the behavior of proximal methods under this CNSC condition: we show global linear convergence of Proximal Gradient and local quadratic convergence of Proximal Newton Method, when the regularization function comprising the statistical estimator is decomposable. We corroborate our theory via numerical experiments, and show a qualitative difference in the convergence rates of the proximal algorithms when the loss function does satisfy the CNSC condition.

artificial intelligence, convergence, machine learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Reviews: Distributed Multitask Reinforcement Learning with Quadratic Convergence

Neural Information Processing SystemsOct-7-2024, 15:46:42 GMT

In this paper, the authors studied the problem of multitask reinforcement learning (MTRL), and propose several optimization techniques to alleviate the scalability issues observed in other methods, especially when the number of tasks or trajectories is large. Specifically, they rely on consensus algorithms to scale up MTRL algorithms and avoid the issues that exist in centralized solution methods. Furthermore, they show how MTRL algorithms can be improved over state-of-the-art benchmarks by considering the problem from a variational inference perspective, and then propose a novel distributed solver for MTRL with quadratic convergence guarantees. In general, this work is tackling some important problems in the increasingly popular domain of multi-task RL. Using the variational perspective of RL, the problem of MTRL can be cast as a variational inference problem, and policy search can be done through the minimization of the ELBO loss. To alternate the updates on variational parameters and the policy parameters, the authors also propose using EM based approaches, which is very reasonable.

algorithm, consensus algorithm, multitask reinforcement learning, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)

Add feedback

On Quadratic Convergence of DC Proximal Newton Algorithm in Nonconvex Sparse Learning

Xingguo Li, Lin Yang, Jason Ge, Jarvis Haupt, Tong Zhang, Tuo Zhao

Neural Information Processing SystemsOct-3-2024, 03:40:54 GMT

We propose a DC proximal Newton algorithm for solving nonconvex regularized sparse learning problems in high dimensions. Our proposed algorithm integrates the proximal newton algorithm with multi-stage convex relaxation based on the difference of convex (DC) programming, and enjoys both strong computational and statistical guarantees. Specifically, by leveraging a sophisticated characterization of sparse modeling structures (i.e., local restricted strong convexity and Hessian smoothness), we prove that within each stage of convex relaxation, our proposed algorithm achieves (local) quadratic convergence, and eventually obtains a sparse approximate local optimum with optimal statistical properties after only a few convex relaxations. Numerical experiments are provided to support our theory.

algorithm, convergence, convex relaxation, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Minnesota (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.47)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Constant Nullspace Strong Convexity and Fast Convergence of Proximal Methods under High-Dimensional Settings

Neural Information Processing SystemsMar-13-2024, 14:10:55 GMT

cnsc condition, constant nullspace strong convexity, convergence, (10 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Taiwan > Taiwan Province > Taipei (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Add feedback

Approximate Newton policy gradient algorithms

Li, Haoya, Gupta, Samarth, Yu, Hsiangfu, Ying, Lexing, Dhillon, Inderjit

arXiv.org Artificial IntelligenceJun-8-2023

Policy gradient algorithms have been widely applied to Markov decision processes and reinforcement learning problems in recent years. Regularization with various entropy functions is often used to encourage exploration and improve stability. This paper proposes an approximate Newton method for the policy gradient algorithm with entropy regularization. In the case of Shannon entropy, the resulting algorithm reproduces the natural policy gradient algorithm. For other entropy functions, this method results in brand-new policy gradient algorithms. We prove that all these algorithms enjoy Newton-type quadratic convergence and that the corresponding gradient flow converges globally to the optimal solution. We use synthetic and industrial-scale examples to demonstrate that the proposed approximate Newton method typically converges in single-digit iterations, often orders of magnitude faster than other state-of-the-art algorithms.

artificial intelligence, convergence, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2110.02398

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Modified Gauss-Newton Algorithms under Noise

Pillutla, Krishna, Roulet, Vincent, Kakade, Sham, Harchaoui, Zaid

arXiv.org Artificial IntelligenceMay-17-2023

The Gauss-Newton method and its variants such as the Levenberg-Marquardt method [15, 16] have been applied successfully in phase retrieval [5, 11, 20], nonlinear control [22, 24], and non-negative matrix factorization [12]. Modern machine learning problems such as deep learning possess a similar compositional structure, which makes Gauss-Newton-like algorithms potential good candidates [8, 26, 30]. However, in such problems, we are often interested in the generalization performance on unseen data. It is unclear whether the additional cost of solving the subproblems can be amortized by the superior efficiency of Gauss-Newton-like algorithms. In this paper, we investigate whether modified Gauss-Newton methods or prox-linear algorithms with incremental gradient inner loops are superior to direct stochastic subgradient algorithms for nonsmooth problems with a compositional objective and a finite-sum structure in terms of generalization error.

algorithm, artificial intelligence, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.10634

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback